Author Profiling: Predicting Age and Gender from Blogs Notebook for PAN at CLEF 2013
Authors
Abstract
Author profiling is the task of determining the age, gender, native language, or personality type of an author by studying the sociolect aspect of their writing, that is, how language is shared by people. In this paper, we propose a machine learning approach to determine an unknown author's age and gender. The approach uses three types of features: content-based, style-based, and topic-based. We achieved accuracies of 64.08% and 64.30% for age, and 56.53% and 64.73% for gender, in English and Spanish respectively.
Similar resources
Style-based Distance Features for Author Verification Notebook for PAN at CLEF 2013
In this paper we present the approach we took in our participation in the PAN 2013 Author Profiling task. It is an adaptation of our system submitted for author identification, assuming that a profile category (authors belonging to the same gender and age group categories) can be analyzed in the same way as an author's style.
Using Simple Content Features for the Author Profiling Task Notebook for PAN at CLEF 2013
This paper describes the methods we have employed to solve the author profiling task at PAN-2013. Our goal was to use simple features to identify the age group and the gender of the author of a given text. We introduce the features, detail how the classifiers were trained, and how the experiments were run.
Author Profiling using LDA and Maximum Entropy Notebook for PAN at CLEF 2013
This paper describes the traditional authorship attribution subtask of the PAN/CLEF 2013 workshop. In our attempt to classify documents by the gender and age of their author, we applied a traditional topic modeling approach using Latent Dirichlet Allocation (LDA). We used content-based features such as topics and style-based features such as preposition frequencies, which act as the eff...
Automatic Author Profiling Based on Linguistic and Stylistic Features Notebook for PAN at CLEF 2013
The rapid expansion of blogs and other electronic data in Web 2.0 makes it increasingly important to identify an author's profile. The problem of automatically identifying an author's gender and age from linguistic and stylistic patterns has been a subject of increasing research interest in recent years. The research methodologies are also helpful for several other appli...
Semantic-based Features for Author Profiling Identification: First insights Notebook for PAN at CLEF 2013
In this article we present a semantic-based approach to identifying particular author traits, such as age and gender, from social media texts. The model described here is intended to provide information at different levels of analysis, from textual markers to semantics. Different classifiers were used to assess the performance and scope of the model.